Natural Language Processing for Cultural Heritage Domains

Author

  • Caroline Sporleder
Abstract

Museums, archives, libraries and other cultural heritage institutes maintain large collections of artefacts which are valuable knowledge sources for both experts and interested lay persons. Recently, more and more cultural heritage institutes have started to digitise their collections, for instance to make them accessible via web portals. However, while digitisation is a necessary first step towards improved information access, to fully unlock the knowledge contained in these collections, users have to be able to easily browse, search and query these collections. This requires cleaning, linking and enriching the data, a process that is often too time-consuming to be performed manually. Information technology can help with (partially) automating this task. Since data processing and enrichment typically involve the textual metadata level, natural language processing has a key role to play in this endeavour. At the same time cultural heritage domains pose significant challenges for language technology and call for the development of very robust and flexible solutions. Consequently, cultural heritage data can also serve as a good test-bed for the development of robust natural language processing tools.

Similar resources

Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities

The LaTeCH (Language Technology for Cultural Heritage, Social Sciences, and Humanities) annual workshop series aims to provide a forum for researchers who are working on aspects of natural language and information technology applications that pertain to data from the humanities, social sciences, and cultural heritage. The LaTeCH workshops were initially motivated by the growing inter...

Voice knowledge acquisition system for the management of cultural heritage

This document presents our work on a definition and experimentation of a voice interface for cultural heritage inventory. This hybrid system includes signal processing, natural language techniques and knowledge modeling for future retrieval. We discuss the first results and give some points on future work.

Multilingual access to cultural heritage content on the Semantic Web

As the amount of cultural data available on the Semantic Web expands, the demand for access to this data in multiple languages is increasing. Previous work on multilingual access to cultural heritage information has shown that mapping from ontologies to natural language requires at least two different steps: (1) mapping multilingual metadata to interoperable knowledge sources; (2) assigning...

Integrating Wiki Systems, Natural Language Processing, and Semantic Technologies for Cultural Heritage Data Management

Modern documents can easily be structured and augmented to have the characteristics of a semantic knowledge base. Many older documents may also hold a trove of knowledge that would deserve to be organized as such a knowledge base. In this chapter, we show that modern semantic technologies offer the means to make these heritage documents accessible by transforming them into a semantic knowledge ...

PATHS: A System for Accessing Cultural Heritage Collections

This paper describes a system for navigating large collections of information about cultural heritage which is applied to Europeana, the European Library. Europeana contains over 20 million artefacts with meta-data in a wide range of European languages. The system currently provides access to Europeana content with meta-data in English and Spanish. The paper describes how Natural Language Proce...

Journal:
  • Language and Linguistics Compass

Volume 4

Publication date: 2010